(57) Abstract

A method of coding an audio signal comprises receiving an audio signal x to be coded and transforming the received signal from the
time to the frequency domain. A quantised audio signal $\tilde{x}$ is generated from the transformed audio signal x together with a set of long-term
prediction coefficients A which can be used to predict a current time frame of the received audio signal directly from one or more previous
time frames of the quantised audio signal $\tilde{x}$. A predicted audio signal $\hat{x}$ is generated using the prediction coefficients A. The predicted
audio signal $\hat{x}$ is then transformed from the time to the frequency domain and the resulting frequency domain signal compared with that of
the received audio signal x to generate an error signal E(k) for each of a plurality of frequency sub-bands. The error signals E(k) are then
quantised to generate a set of quantised error signals $\tilde{E}(k)$ which are combined with the prediction coefficients A to generate a coded audio
signal.

Audio coding method and apparatus

The present invention relates to a method and apparatus for audio coding and to a
method and apparatus for audio decoding.

It is well known that the transmission of data in digital form provides for increased
signal to noise ratios and increased information capacity along the transmission
channel. There is however a continuing desire to further increase channel
capacity by compressing digital signals to an ever greater extent. In relation to
audio signals, two basic compression principles are conventionally applied. The
first of these involves removing the statistical or deterministic redundancies in the
source signal whilst the second involves suppressing or eliminating from the
source signal elements which are redundant insofar as human perception is
concerned. Recently, the latter principle has become predominant in high quality
audio applications and typically involves the separation of an audio signal into its
frequency components (sometimes called "sub-bands"), each of which is analysed
and quantised with a quantisation accuracy determined to remove data irrelevancy
(to the listener). The ISO (International Standards Organisation) MPEG (Moving
Pictures Expert Group) audio coding standard and other audio coding standards
employ and further define this principle. However, MPEG (and other standards)
also employs a technique known as "adaptive prediction" to produce a further
reduction in data rate.

The operation of an encoder according to the new MPEG-2 AAC standard is
described in detail in the draft International standard document ISO/IEC DIS
13818-7. This new MPEG-2 standard employs backward linear prediction with
672 of 1024 frequency components. It is envisaged that the new MPEG-4
standard will have similar requirements. However, such a large number of
frequency components results in a large computational overhead due to the
complexity of the prediction algorithm and also requires the availability of large
amounts of memory to store the calculated and intermediate coefficients. It is well
known that when backward adaptive predictors of this type are used in the
frequency domain, it is difficult to further reduce the computational loads and
memory requirements. This is because the number of predictors is so large in the
frequency domain that even a very simple adaptive algorithm still results in large
computational complexity and memory requirements. Whilst it is known to avoid
this problem by using forward adaptive predictors which are updated in the
encoder and transmitted to the decoder, the use of forward adaptive predictors in
the frequency domain inevitably results in a large amount of "side" information
because the number of predictors is so large.

It is an object of the present invention to overcome or at least mitigate the
disadvantages of known prediction methods.

This and other objects are achieved by coding an audio signal using error signals
to remove redundancy in each of a plurality of frequency sub-bands of the audio
signal and in addition generating long term prediction coefficients in the time
domain which enable a current frame of the audio signal to be predicted from one
or more previous frames.

According to a first aspect of the present invention there is provided a method of
coding an audio signal, the method comprising the steps of:
receiving an audio signal x to be coded;
generating a quantised audio signal $\tilde{x}$ from the received audio signal x;
generating a set of long-term prediction coefficients A which can be used
to predict a current time frame of the received audio signal x directly from at least
one previous time frame of the quantised audio signal $\tilde{x}$;
using the prediction coefficients A to generate a predicted audio signal $\hat{x}$;
comparing the received audio signal x with the predicted audio signal $\hat{x}$
and generating an error signal E(k) for each of a plurality of frequency sub-bands;
quantising the error signals E(k) to generate a set of quantised error
signals $\tilde{E}(k)$; and
combining the quantised error signals $\tilde{E}(k)$ and the prediction coefficients
A to generate a coded audio signal.

The present invention provides for compression of an audio signal using a forward
adaptive predictor in the time domain. For each time frame of a received signal, it
is only necessary to generate a single set of forward adaptive prediction
coefficients for transmission to the decoder. This is in contrast to
known forward adaptive prediction techniques which require the generation of a
set of prediction coefficients for each frequency sub-band of each time frame. In
comparison to the prediction gains obtained by the present invention, the side
information of the long term predictor is negligible.

Certain embodiments of the present invention enable a reduction in computational
complexity and in memory requirements. In particular, in comparison to the use of
backward adaptive prediction, there is no requirement to recalculate the prediction
coefficients in the decoder. Certain embodiments of the invention are also able to
respond more quickly to signal changes than conventional backward adaptive
predictors.

In one embodiment of the invention, the received audio signal x is transformed in
frames $x_m$ from the time domain to the frequency domain to provide a set of
frequency sub-band signals X(k). The predicted audio signal $\hat{x}$ is similarly
transformed from the time domain to the frequency domain to generate a set of
predicted frequency sub-band signals $\hat{X}(k)$ and the comparison between the
received audio signal x and the predicted audio signal $\hat{x}$ is carried out in the
frequency domain, comparing respective sub-band signals against each other to
generate the frequency sub-band error signals E(k). The quantised audio signal
$\tilde{x}$ is generated by summing the predicted signal and the quantised error signal,
either in the time domain or in the frequency domain.



WO 98/42083  PCT/FI98/00146

In an alternative embodiment of the invention, the comparison between the
received audio signal x and the predicted audio signal $\hat{x}$ is carried out in the time
domain to generate an error signal e also in the time domain. This error signal e
is then converted from the time to the frequency domain to generate said plurality
of frequency sub-band error signals E(k).

Preferably, the quantisation of the error signals is carried out according to a 
psycho-acoustic model. 

According to a second aspect of the present invention there is provided a method
of decoding a coded audio signal, the method comprising the steps of:
receiving a coded audio signal comprising a quantised error signal $\tilde{E}(k)$ for
each of a plurality of frequency sub-bands of the audio signal and, for each time
frame of the audio signal, a set of prediction coefficients A which can be used to
predict a current time frame $x_m$ of the received audio signal directly from at least
one previous time frame of a reconstructed quantised audio signal $\tilde{x}$;
generating said reconstructed quantised audio signal $\tilde{x}$ from the quantised
error signals $\tilde{E}(k)$;
using the prediction coefficients A and the quantised audio signal $\tilde{x}$ to
generate a predicted audio signal $\hat{x}$;
transforming the predicted audio signal $\hat{x}$ from the time domain to the
frequency domain to generate a set of predicted frequency sub-band signals
$\hat{X}(k)$ for combining with the quantised error signals $\tilde{E}(k)$ to generate a set of
reconstructed frequency sub-band signals $\tilde{X}(k)$; and
performing a frequency to time domain transform on the reconstructed
frequency sub-band signals $\tilde{X}(k)$ to generate the reconstructed quantised audio
signal $\tilde{x}$.

Embodiments of the above second aspect of the invention are particularly
applicable where only a sub-set of all possible quantised error signals $\tilde{E}(k)$ are
received, some sub-band data being transmitted directly by the transmission of
audio sub-band signals X(k). The signals $\tilde{X}(k)$ and X(k) are combined
appropriately prior to carrying out the frequency to time transform.

According to a third aspect of the present invention there is provided apparatus for
coding an audio signal, the apparatus comprising:
an input for receiving an audio signal x to be coded;
quantisation means coupled to said input for generating from the received
audio signal x a quantised audio signal $\tilde{x}$;
prediction means coupled to said quantisation means for generating a set
of long-term prediction coefficients A for predicting a current time frame $x_m$ of the
received audio signal x directly from at least one previous time frame of the
quantised audio signal $\tilde{x}$;
generating means for generating a predicted audio signal $\hat{x}$ using the
prediction coefficients A and for comparing the received audio signal x with the
predicted audio signal $\hat{x}$ to generate an error signal E(k) for each of a plurality of
frequency sub-bands;
quantisation means for quantising the error signals E(k) to generate a set
of quantised error signals $\tilde{E}(k)$; and
combining means for combining the quantised error signals $\tilde{E}(k)$ with the
prediction coefficients A to generate a coded audio signal.

In one embodiment, said generating means comprises first transform means for
transforming the received audio signal x from the time to the frequency domain
and second transform means for transforming the predicted audio signal $\hat{x}$ from
the time to the frequency domain, and comparison means arranged to compare
the resulting frequency domain signals in the frequency domain.

In an alternative embodiment of the invention, the generating means is arranged
to compare the received audio signal x and the predicted audio signal $\hat{x}$ in the
time domain.

According to a fourth aspect of the present invention there is provided apparatus
for decoding a coded audio signal x, where the coded audio signal comprises a
quantised error signal $\tilde{E}(k)$ for each of a plurality of frequency sub-bands of the
audio signal and a set of prediction coefficients A for each time frame of the
audio signal and wherein the prediction coefficients A can be used to predict a
current time frame $x_m$ of the received audio signal directly from at least one
previous time frame of a reconstructed quantised audio signal $\tilde{x}$, the apparatus
comprising:
an input for receiving the coded audio signal;
generating means for generating said reconstructed quantised audio signal
$\tilde{x}$ from the quantised error signals $\tilde{E}(k)$; and
signal processing means for generating a predicted audio signal $\hat{x}$ from the
prediction coefficients A and said reconstructed audio signal $\tilde{x}$,
wherein said generating means comprises first transforming means for
transforming the predicted audio signal $\hat{x}$ from the time domain to the frequency
domain to generate a set of predicted frequency sub-band signals $\hat{X}(k)$,
combining means for combining said set of predicted frequency sub-band signals
$\hat{X}(k)$ with the quantised error signals $\tilde{E}(k)$ to generate a set of reconstructed
frequency sub-band signals $\tilde{X}(k)$, and second transforming means for performing
a frequency to time domain transform on the reconstructed frequency sub-band
signals $\tilde{X}(k)$ to generate the reconstructed quantised audio signal $\tilde{x}$.

For a better understanding of the present invention and in order to show how the
same may be carried into effect reference will now be made, by way of example,
to the accompanying drawings, in which:

Figure 1 shows schematically an encoder for coding a received audio signal;
Figure 2 shows schematically a decoder for decoding an audio signal coded with
the encoder of Figure 1;
Figure 3 shows the encoder of Figure 1 in more detail including a predictor tool of
the encoder;
Figure 4 shows the decoder of Figure 2 in more detail including a predictor tool of
the decoder; and
Figure 5 shows in detail a modification to the encoder of Figure 1 and which
employs an alternative prediction tool.

There is shown in Figure 1 a block diagram of an encoder which performs the
coding function defined in general terms in the MPEG-2 AAC standard. The input
to the encoder is a sampled monophasic signal x whose sample points are
grouped into time frames or blocks of 2N points, i.e.

$$x_m = (x_m(0), x_m(1), \ldots, x_m(2N-1))^T \qquad (1)$$

where m is the block index and T denotes transposition. The grouping of sample
points is carried out by a filter bank tool 1 which also performs a modified discrete
cosine transform (MDCT) on each individual frame of the audio signal to generate
a set of frequency sub-band coefficients

$$X_m = (X_m(0), X_m(1), \ldots, X_m(N-1))^T \qquad (2)$$

The sub-bands are defined in the MPEG standard.

The forward MDCT is defined by

$$X_m(k) = \sum_{i=0}^{2N-1} f(i)\, x_m(i) \cos\left(\frac{\pi}{4N}(2i+1+N)(2k+1)\right), \quad k = 0, \ldots, N-1 \qquad (3)$$

where f(i) is the analysis-synthesis window, a symmetric window chosen so that
its overlap-add effect produces unity gain in the signal.
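
As an illustration of equation (3), the following minimal Python sketch evaluates the forward MDCT directly from its definition. The sine window chosen for f(i) and the names `sine_window` and `mdct` are assumptions made for this example; the text only requires a symmetric window whose overlap-add gives unity gain.

```python
import math

def sine_window(two_n):
    """Assumed analysis-synthesis window f(i) = sin(pi * (i + 0.5) / 2N)."""
    return [math.sin(math.pi * (i + 0.5) / two_n) for i in range(two_n)]

def mdct(frame, window):
    """Direct O(N^2) evaluation of equation (3) for one frame of 2N samples."""
    two_n = len(frame)
    n = two_n // 2
    coeffs = []
    for k in range(n):
        acc = 0.0
        for i in range(two_n):
            angle = math.pi / (4 * n) * (2 * i + 1 + n) * (2 * k + 1)
            acc += window[i] * frame[i] * math.cos(angle)
        coeffs.append(acc)
    return coeffs

# One frame of 2N = 16 samples gives N = 8 sub-band coefficients.
frame = [math.sin(2 * math.pi * 3 * i / 16) for i in range(16)]
X = mdct(frame, sine_window(16))
```

A practical encoder would use a fast algorithm rather than this double loop; the direct form is kept here only because it mirrors the equation term by term.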

The frequency sub-band signals X(k) are in turn applied to a prediction tool 2
(described in more detail below) which seeks to eliminate long term redundancy in
each of the sub-band signals. The result is a set of frequency sub-band error
signals

$$E_m(k) = (E_m(0), E_m(1), \ldots, E_m(N-1))^T \qquad (4)$$

which are indicative of long term changes in respective sub-bands, and a set of
forward adaptive prediction coefficients A for each frame.

The sub-band error signals E(k) are applied to a quantiser 3 which quantises
each signal with a number of bits determined by a psychoacoustic model. This
model is applied by a controller 4. As discussed, the psychoacoustic model is
used to model the masking behaviour of the human auditory system. The
quantised error signals $\tilde{E}(k)$ and the prediction coefficients A are then combined
in a bit stream multiplexer 5 for transmission via a transmission channel 6.

Figure 2 shows the general arrangement of a decoder for decoding an audio
signal coded with the encoder of Figure 1. A bit-stream demultiplexer 7 first
separates the prediction coefficients A from the quantised error signals $\tilde{E}(k)$ and
separates the error signals into the separate sub-band signals. The prediction
coefficients A and the quantised error sub-band signals $\tilde{E}(k)$ are provided to a
prediction tool 8 which reverses the prediction process carried out in the encoder,
i.e. the prediction tool reinserts the redundancy extracted in the encoder, to
generate reconstructed quantised sub-band signals $\tilde{X}(k)$. A filter bank tool 9
then recovers the time domain signal $\tilde{x}$, by an inverse transformation on the
received version $\tilde{X}(k)$, described by

$$\tilde{x}_m(i) = \tilde{u}_{m-1}(i+N) + \tilde{u}_m(i), \quad i = 0, \ldots, N-1$$

where $\tilde{u}_k(i)$, $i = 0, \ldots, 2N-1$, are the inverse transform of $\tilde{X}$

$$\tilde{u}_m(i) = f(i) \sum_{k=0}^{N-1} \tilde{X}_m(k) \cos\left(\frac{\pi}{4N}(2i+1+N)(2k+1)\right), \quad i = 0, \ldots, 2N-1$$

and which approximates the original audio signal x.
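
The inverse transform and overlap-add step can be sketched in Python as follows. This is a minimal illustration, not the decoder's actual filter bank: the sine window, the 2/N scale factor in `imdct` (which the equations above leave unstated) and all function names are assumptions chosen so that the unscaled forward transform of equation (3) is inverted exactly when no quantisation is applied.

```python
import math

def sine_window(two_n):
    """Assumed window f(i); it satisfies f(i)^2 + f(i+N)^2 = 1."""
    return [math.sin(math.pi * (i + 0.5) / two_n) for i in range(two_n)]

def mdct(frame, window):
    """Forward transform of equation (3)."""
    n = len(frame) // 2
    return [sum(window[i] * frame[i] *
                math.cos(math.pi / (4 * n) * (2 * i + 1 + n) * (2 * k + 1))
                for i in range(2 * n))
            for k in range(n)]

def imdct(coeffs, window):
    """Windowed inverse transform u_m(i), i = 0..2N-1 (2/N scaling is assumed)."""
    n = len(coeffs)
    return [window[i] * (2.0 / n) *
            sum(coeffs[k] *
                math.cos(math.pi / (4 * n) * (2 * i + 1 + n) * (2 * k + 1))
                for k in range(n))
            for i in range(2 * n)]

def overlap_add(u_prev, u_curr):
    """x_m(i) = u_{m-1}(i + N) + u_m(i) for i = 0..N-1."""
    n = len(u_curr) // 2
    return [u_prev[i + n] + u_curr[i] for i in range(n)]

N = 8
signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(3 * N)]
w = sine_window(2 * N)
u0 = imdct(mdct(signal[0:2 * N], w), w)   # frame m-1
u1 = imdct(mdct(signal[N:3 * N], w), w)   # frame m, advanced by N samples
recon = overlap_add(u0, u1)               # reconstructs signal[N:2N]
max_err = max(abs(a - b) for a, b in zip(recon, signal[N:2 * N]))
```

With quantised coefficients in place of the exact ones, the same overlap-add yields the approximation of x referred to above.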

Figure 3 illustrates in more detail the prediction method of the encoder of Figure 1.
Using the quantised frequency sub-band error signals $\tilde{E}(k)$, a set of quantised
frequency sub-band signals $\tilde{X}(k)$ are generated by a signal processing unit 10.
The signals $\tilde{X}(k)$ are applied in turn to a filter bank 11 which applies an inverse
modified discrete cosine transform (IMDCT) to the signals to generate a quantised
time domain signal $\tilde{x}$. The signal $\tilde{x}$ is then applied to a long term predictor tool
12 which also receives the audio input signal x. The predictor tool 12 uses a long
term (LT) predictor to remove the redundancy in the audio signal present in a
current frame m+1, based upon the previously quantised data. The transfer
function P of this predictor is:

$$P(z) = \sum_{k=-m_1}^{m_2} b_k z^{-(\alpha + k)} \qquad (5)$$

where $\alpha$ represents a long delay in the range 1 to 1024 samples and $b_k$ are
prediction coefficients. For $m_1 = m_2 = 0$ the predictor is one tap whilst for
$m_1 = m_2 = 1$ the predictor is three tap.

The parameters $\alpha$ and $b_k$ are determined by minimising the mean squared error
after LT prediction over a period of 2N samples. For a one tap predictor, the LT
prediction residual r(i) is given by:

$$r(i) = x(i) - b\,\tilde{x}(i - 2N + 1 - \alpha) \qquad (6)$$

where x is the time domain audio signal and $\tilde{x}$ is the time domain quantised
signal. The mean squared residual R is given by:

$$R = \sum_{i=0}^{2N-1} r^2(i) = \sum_{i=0}^{2N-1} \left(x(i) - b\,\tilde{x}(i - 2N + 1 - \alpha)\right)^2 \qquad (7)$$

Setting $\partial R / \partial b = 0$ yields

$$b = \frac{\sum_{i=0}^{2N-1} x(i)\,\tilde{x}(i - 2N + 1 - \alpha)}{\sum_{i=0}^{2N-1} \left(\tilde{x}(i - 2N + 1 - \alpha)\right)^2} \qquad (8)$$

and substituting for b into equation (7) gives

$$R = \sum_{i=0}^{2N-1} x^2(i) - \frac{\left(\sum_{i=0}^{2N-1} x(i)\,\tilde{x}(i - 2N + 1 - \alpha)\right)^2}{\sum_{i=0}^{2N-1} \left(\tilde{x}(i - 2N + 1 - \alpha)\right)^2} \qquad (9)$$

Minimizing R means maximizing the second term in the right-hand side of
equation (9). This term is computed for all possible values of $\alpha$ over its specified
range, and the value of $\alpha$ which maximizes this term is chosen. The energy in
the denominator of equation (9), identified as $\Omega$, can be easily updated from
delay ($\alpha$ - 1) to $\alpha$ instead of recomputing it afresh using:

$$\Omega_\alpha = \Omega_{\alpha-1} + \tilde{x}^2(-\alpha) - \tilde{x}^2(-\alpha + N) \qquad (10)$$

If a one-tap LT predictor is used, then equation (8) is used to compute the
prediction coefficient b. For a j-tap predictor, the LT prediction delay $\alpha$ is first
determined by maximizing the second term of Equation (9) and then a set of
j x j equations is solved to compute the j prediction coefficients.
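
The one-tap search can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the name `lt_search`, the convention that `past[-1]` holds the quantised sample at time -1 relative to the start of the current frame, and the choice to recompute the denominator energy for every delay (rather than update it recursively) are all made for this illustration.

```python
import math

def lt_search(x, past, max_delay):
    """Return (alpha, b) maximising the second term of equation (9).

    x    : current frame of 2N input samples, x(0)..x(2N-1)
    past : quantised samples before the frame, with past[-1] at time -1;
           must hold at least 2N - 1 + max_delay samples
    """
    two_n = len(x)
    best = None
    for alpha in range(1, max_delay + 1):
        # Reference samples x~(i - 2N + 1 - alpha); every index is negative.
        ref = [past[i - two_n + 1 - alpha] for i in range(two_n)]
        cross = sum(a * r for a, r in zip(x, ref))
        energy = sum(r * r for r in ref)
        if energy == 0.0:
            continue
        score = cross * cross / energy            # second term of equation (9)
        if best is None or score > best[0]:
            best = (score, alpha, cross / energy)  # gain b from equation (8)
    if best is None:
        return None
    return best[1], best[2]

# Period-10 tone with 2N = 8: the lag 2N - 1 + alpha equals 10 when alpha = 3.
def tone(t):
    return math.sin(2 * math.pi * t / 10)

past = [tone(t) for t in range(-20, 0)]
frame = [tone(t) for t in range(8)]
alpha, b = lt_search(frame, past, max_delay=6)
```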

The LT prediction parameters A are the delay $\alpha$ and prediction coefficient $b_j$.
The delay is quantized with 9 to 11 bits depending on the range used. Most
commonly 10 bits are utilized, with 1024 possible values in the range 1 to 1024.
To reduce the number of bits, the LT prediction delays can be delta coded in even
frames with 5 bits. Experiments show that it is sufficient to quantize the gain with
3 to 6 bits. Due to the nonuniform distribution of the gain, nonuniform quantization
has to be used.
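
A possible coding of these parameters is sketched below in Python. The 3-bit gain codebook values, the signed offset used for the 5-bit delta code and all function names are hypothetical; the text specifies only the bit counts and that the gain quantiser is nonuniform.

```python
# Hypothetical 3-bit nonuniform gain levels, denser towards 1.0 (assumption).
GAIN_CODEBOOK = [0.0, 0.3, 0.5, 0.65, 0.78, 0.88, 0.95, 1.0]

def encode_delay(alpha, bits=10):
    """Map a delay in 1..2**bits onto a code in 0..2**bits - 1."""
    if not 1 <= alpha <= (1 << bits):
        raise ValueError("delay out of range")
    return alpha - 1

def decode_delay(code):
    return code + 1

def encode_delay_delta(alpha, prev_alpha, bits=5):
    """Code alpha relative to the previous delay, or None if the step does not fit."""
    half = 1 << (bits - 1)
    delta = alpha - prev_alpha
    if -half <= delta < half:
        return delta + half
    return None

def quantise_gain(b):
    """Return (index, level) of the codebook entry nearest to b."""
    index = min(range(len(GAIN_CODEBOOK)), key=lambda j: abs(GAIN_CODEBOOK[j] - b))
    return index, GAIN_CODEBOOK[index]
```

For example, `encode_delay_delta(100, 97)` fits in 5 bits, whereas a jump from 50 to 100 returns `None`, in which case an encoder built this way would have to fall back to the full-width delay code.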

In the method described above, the stability of the LT synthesis filter 1/P(z) is not
always guaranteed. For a one-tap predictor, the stability condition is |b| < 1.
Therefore, the stabilization can be easily carried out by setting |b| = 1 whenever
|b| > 1. For a 3-tap predictor, another stabilization procedure can be used such as
is described in R.P. Ramachandran and P. Kabal, "Stability and performance
analysis of pitch filters in speech coders," IEEE Trans. ASSP, vol. 35, no. 7,
pp. 937-946, July 1987. However, the instability of the LT synthesis filter is not that
harmful to the quality of the reconstructed signal. The unstable filter will persist for
a few frames (increasing the energy), but eventually periods of stability are
encountered so that the output does not continue to increase with time.

After the LT predictor coefficients are determined, the predicted signal for the
(m+1)th frame can be determined:

$$\hat{x}(i) = \sum_{j=-m_1}^{m_2} b_j\, \tilde{x}(i - 2N + 1 - j - \alpha), \quad i = mN+1, mN+2, \ldots, (m+1)N \qquad (11)$$

The predicted time domain signal $\hat{x}$ is then applied to a filter bank 13 which
applies a MDCT to the signal to generate predicted spectral coefficients $\hat{X}_{m+1}(k)$
for the (m+1)th frame. The predicted spectral coefficients $\hat{X}(k)$ are then
subtracted from the spectral coefficients X(k) at a subtractor 14.
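
The prediction of equation (11) can be sketched in Python as follows. For simplicity the sketch indexes the frame relative to its own start (i = 0..2N-1, as in equation (6)) rather than with the absolute index of equation (11); that choice, the name `lt_predict` and the `past[-1]` = time -1 convention are assumptions of this example.

```python
def lt_predict(past, frame_len, alpha, taps, m1=0):
    """Predict one frame from previously quantised samples.

    past      : quantised samples before the frame, past[-1] at time -1
    frame_len : number of samples to predict (2N)
    alpha     : long-term delay
    taps      : [b_{-m1}, ..., b_{m2}]; a single entry gives the one-tap predictor
    """
    if alpha <= m1:
        raise ValueError("delay too short: prediction would need current-frame samples")
    predicted = []
    for i in range(frame_len):
        acc = 0.0
        for offset, b_j in enumerate(taps):
            j = offset - m1
            # Index i - 2N + 1 - j - alpha is always negative here,
            # so Python's negative indexing reads the correct past sample.
            acc += b_j * past[i - frame_len + 1 - j - alpha]
        predicted.append(acc)
    return predicted

# Ramp history x~(t) = t makes the index arithmetic easy to check by hand.
past = [float(t) for t in range(-20, 0)]
one_tap = lt_predict(past, 8, 3, [1.0])
three_tap = lt_predict(past, 8, 3, [0.25, 0.5, 0.25], m1=1)
```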

In order to guarantee that prediction is only used if it results in a coding gain, an
appropriate predictor control is required and a small amount of predictor control
information has to be transmitted to the decoder. This function is carried out in
the subtractor 14. The predictor control scheme is the same as for the backward
adaptive predictor control scheme which has been used in MPEG-2 Advanced
Audio Coding (AAC). The predictor control information for each frame, which is
transmitted as side information, is determined in two steps. Firstly, for each
scalefactor band it is determined whether or not prediction leads to a coding gain
and if yes, the predictor_used bit for that scalefactor band is set to one. After
this has been done for all scalefactor bands, it is determined whether the overall
coding gain by prediction in this frame at least compensates for the additional bits
needed for the predictor side information. If yes, the predictor_data_present bit is
set to 1 and the complete side information including that needed for predictor
reset is transmitted and the prediction error value is fed to the quantizer.
Otherwise, the predictor_data_present bit is set to 0 and the prediction_used
bits are all reset to zero and are not transmitted. In this case, the spectral
component value is fed to the quantizer 3. As described above, the predictor
control first operates on all predictors of one scalefactor band and is then followed
by a second step over all scalefactor bands.

It will be apparent that the aim of LT prediction is to achieve the largest overall
prediction gain. Let $G_l$ denote the prediction gain in the l-th frequency sub-band.
The overall prediction gain in a given frame can be calculated as follows:

$$G = \sum_{l = 1,\; G_l > 0} G_l \qquad (12)$$

If the gain compensates for the additional bits needed for the predictor side
information, i.e., G > J (dB), the complete side information is transmitted and the
predictors which produce positive gains are switched on. Otherwise, the predictors
are not used.
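
The two-step control and the overall gain of equation (12) can be sketched in Python as follows. The per-band gain measure (ratio of signal energy to prediction-error energy in dB), the rule that a band with its flag cleared passes the spectral value unchanged, and the names `band_gain_db`, `predictor_control` and `threshold_db` are assumptions of this example rather than details given in the text.

```python
import math

def band_gain_db(spectrum, error, lo, hi):
    """Assumed gain G_l: 10*log10(signal energy / prediction-error energy)."""
    e_sig = sum(v * v for v in spectrum[lo:hi])
    e_err = sum(v * v for v in error[lo:hi])
    if e_sig == 0.0:
        return 0.0
    if e_err == 0.0:
        return float("inf")
    return 10.0 * math.log10(e_sig / e_err)

def predictor_control(spectrum, error, band_edges, threshold_db):
    """Return (predictor_data_present, per-band flags, quantiser input)."""
    bands = list(zip(band_edges[:-1], band_edges[1:]))
    gains = [band_gain_db(spectrum, error, lo, hi) for lo, hi in bands]
    overall = sum(g for g in gains if g > 0.0)   # equation (12)
    if overall <= threshold_db:
        # Prediction does not pay for its side information: send the spectrum.
        return 0, [0] * len(bands), list(spectrum)
    used = [1 if g > 0.0 else 0 for g in gains]
    out = list(spectrum)
    for flag, (lo, hi) in zip(used, bands):
        if flag:
            out[lo:hi] = error[lo:hi]
    return 1, used, out

spectrum = [4.0, 4.0, 1.0, 1.0]
error = [1.0, 1.0, 2.0, 2.0]      # prediction helps band 0, hurts band 1
present, used, to_quantiser = predictor_control(spectrum, error, [0, 2, 4], 3.0)
```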

The LP parameters obtained by the method set out above are not directly related 
to maximising the gain. However, by calculating the gain for each block and for 
each delay within the selected range (1 to 1024 in this example), and by selecting 
that delay which produces the largest overall prediction gain, the prediction 
process is optimised. The selected delay a and the corresponding coefficients b 
are transmitted as side information with the quantised error sub-band signals. 
Whilst the computational complexity is increased at the encoder, no increase in 
complexity results at the decoder. 

Figure 4 shows in more detail the decoder of Figure 2. The coded audio signal is
received from the transmission channel 6 by the bitstream demultiplexer 7 as
described above. The bitstream demultiplexer 7 separates the prediction
coefficients A and the quantised error signals $\tilde{E}(k)$ and provides these to the
prediction tool 8. This tool comprises a combiner 24 which combines the
quantised error signals $\tilde{E}(k)$ and a predicted audio signal in the frequency domain
$\hat{X}(k)$ to generate a reconstructed audio signal $\tilde{X}(k)$ also in the frequency
domain. The filter bank 9 converts the reconstructed signal $\tilde{X}(k)$ from the
frequency domain to the time domain to generate a reconstructed time domain
audio signal $\tilde{x}$. This signal is in turn fed back to a long term prediction tool 26 which
also receives the prediction coefficients A. The long term prediction tool 26
generates a predicted current time frame from previous reconstructed time frames
using the prediction coefficients for the current frame. A filter bank 25 transforms
the predicted signal $\hat{x}$ from the time domain to the frequency domain.

It will be appreciated that the predictor control information transmitted from the
encoder may be used at the decoder to control the decoding operation. In
particular, the predictor_used bits may be used in the combiner 24 to determine
whether or not prediction has been employed in any given frequency band.
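
One way the combiner could apply these bits is sketched below in Python. It assumes that a band whose predictor_used bit is zero carries the quantised spectral value itself, so nothing is added to it; the function name `combine` and the band-edge representation are also assumptions of this example.

```python
def combine(pred_spectrum, quant_values, band_edges, predictor_used):
    """Rebuild the spectrum band by band.

    pred_spectrum  : predicted coefficients from the decoder's own LT predictor
    quant_values   : received quantised values (errors, or spectrum if unpredicted)
    band_edges     : band l covers coefficients band_edges[l]..band_edges[l+1]-1
    predictor_used : one flag per band
    """
    out = list(quant_values)
    for used, lo, hi in zip(predictor_used, band_edges[:-1], band_edges[1:]):
        if used:
            for k in range(lo, hi):
                out[k] += pred_spectrum[k]   # reinsert the predicted part
    return out

rebuilt = combine([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 10.0, 20.0], [0, 2, 4], [1, 0])
```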

There is shown in Figure 5 an alternative implementation of the audio signal
encoder of Figure 1 in which an audio signal x to be coded is compared with the
predicted signal $\hat{x}$ in the time domain by a comparator 15 to generate an error
signal e also in the time domain. A filter bank tool 16 then transforms the error
signal from the time domain to the frequency domain to generate a set of
frequency sub-band error signals E(k). These signals are then quantised by a
quantiser 17 to generate a set of quantised error signals $\tilde{E}(k)$.
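
Because the MDCT of equation (3) is linear, the time-domain comparison of Figure 5 gives the same unquantised error spectrum as the frequency-domain comparison of Figure 3, provided both use the same window and framing and predictor control is left aside. Writing $\hat{x}_m$ for a frame of the predicted signal (a notation assumed here), the relation is:

```latex
\begin{aligned}
E_m(k) &= \sum_{i=0}^{2N-1} f(i)\,\bigl(x_m(i) - \hat{x}_m(i)\bigr)
          \cos\!\left(\frac{\pi}{4N}(2i+1+N)(2k+1)\right) \\
       &= X_m(k) - \hat{X}_m(k), \qquad k = 0, \ldots, N-1 .
\end{aligned}
```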

A second filter bank 18 is then used to convert the quantised error signals $\tilde{E}(k)$
back into the time domain resulting in a signal $\tilde{e}$. This time domain quantised
error signal $\tilde{e}$ is then combined at a signal processing unit 19 with the predicted
time domain audio signal $\hat{x}$ to generate a quantised audio signal $\tilde{x}$. A prediction
tool 20 performs the same function as the tool 12 of the encoder of Figure 3,
generating the predicted audio signal $\hat{x}$ and the prediction coefficients A. The
prediction coefficients and the quantised error signals are combined at a bit
stream multiplexer 21 for transmission over the transmission channel 22. As
described above, the error signals are quantised in accordance with a
psycho-acoustical model by a controller 23.

The audio coding algorithms described above allow the compression of audio
signals at low bit rates. The technique is based on long term (LT) prediction.
Compared to the known backward adaptive prediction techniques, the techniques
described here deliver higher prediction gains for single instrument music signals
and speech signals whilst requiring only low computational complexity.

Claims

1. A method of coding an audio signal, the method comprising the steps of:
receiving an audio signal x to be coded;
generating a quantised audio signal $\tilde{x}$ from the received audio signal x;
generating a set of long-term prediction coefficients A which can be used
to predict a current time frame of the received audio signal directly from at least
one previous time frame of the quantised audio signal $\tilde{x}$;
using the prediction coefficients A to generate a predicted audio signal $\hat{x}$;
comparing the received audio signal x with the predicted audio signal $\hat{x}$
and generating an error signal E(k) for each of a plurality of frequency sub-bands;
quantising the error signals E(k) to generate a set of quantised error
signals $\tilde{E}(k)$; and
combining the quantised error signals $\tilde{E}(k)$ and the prediction coefficients
A to generate a coded audio signal.

2. A method according to claim 1 and comprising transforming the received
audio signal x in frames $x_m$ from the time domain to the frequency domain to
provide a set of frequency sub-band signals X(k) and transforming the predicted
audio signal $\hat{x}$ from the time domain to the frequency domain to generate a set of
predicted frequency sub-band signals $\hat{X}(k)$, wherein the comparison between the
received audio signal x and the predicted audio signal $\hat{x}$ is carried out in the
frequency domain, comparing respective sub-band signals against each other to
generate the frequency sub-band error signals E(k).

3. A method according to claim 1 and comprising carrying out the comparison
between the received audio signal x and the predicted audio signal $\hat{x}$ in the time
domain to generate an error signal e also in the time domain and converting the
error signal e from the time to the frequency domain to generate said plurality of
frequency sub-band error signals E(k).

4. A method of decoding a coded audio signal, the method comprising the
steps of:
receiving a coded audio signal comprising a quantised error signal $\tilde{E}(k)$ for
each of a plurality of frequency sub-bands of the audio signal and, for each time
frame of the audio signal, a set of prediction coefficients A which can be used to
predict a current time frame $x_m$ of the received audio signal directly from at least
one previous time frame of a reconstructed quantised audio signal $\tilde{x}$;
generating said reconstructed quantised audio signal $\tilde{x}$ from the quantised
error signals $\tilde{E}(k)$;
using the prediction coefficients A and the quantised audio signal $\tilde{x}$ to
generate a predicted audio signal $\hat{x}$;
transforming the predicted audio signal $\hat{x}$ from the time domain to the
frequency domain to generate a set of predicted frequency sub-band signals
$\hat{X}(k)$ for combining with the quantised error signals $\tilde{E}(k)$ to generate a set of
reconstructed frequency sub-band signals $\tilde{X}(k)$; and
performing a frequency to time domain transform on the reconstructed
frequency sub-band signals $\tilde{X}(k)$ to generate the reconstructed quantised audio
signal $\tilde{x}$.

5. Apparatus for coding an audio signal, the apparatus comprising:
an input for receiving an audio signal x to be coded;
processing means (2,3;15-19) coupled to said input for generating from the
received audio signal x a quantised audio signal $\tilde{x}$;
prediction means (12;19) coupled to said processing means (3) for
generating a set of long-term prediction coefficients A for predicting a current time
frame $x_m$ of the received audio signal x directly from at least one previous time
frame of the quantised audio signal $\tilde{x}$;
generating means (10-14;20,15) for generating a predicted audio signal $\hat{x}$
using the prediction coefficients A and for comparing the received audio signal x
with the predicted audio signal $\hat{x}$ to generate an error signal E(k) for each of a
plurality of frequency sub-bands;
quantisation means (3;17) for quantising the error signals E(k) to generate
a set of quantised error signals $\tilde{E}(k)$; and
combining means (5;21) for combining the quantised error signals $\tilde{E}(k)$
with the prediction coefficients A to generate a coded audio signal.

6. Apparatus according to claim 5, wherein said generating means comprises
first transform means (11) for transforming the received audio signal x from the
time to the frequency domain and second transform means (13) for transforming
the predicted audio signal $\hat{x}$ from the time to the frequency domain, and
comparison means (14) arranged to compare the resulting frequency domain
signals in the frequency domain.

7. Apparatus according to claim 6, wherein the generating means is arranged
to compare the received audio signal x and the predicted audio signal $\hat{x}$ in the
time domain.

8. Apparatus for decoding a coded audio signal x, where the coded audio
signal comprises a quantised error signal $\tilde{E}(k)$ for each of a plurality of frequency
sub-bands of the audio signal and a set of prediction coefficients A for each time
frame of the audio signal and wherein the prediction coefficients A can be used to
predict a current time frame $x_m$ of the received audio signal directly from at least
one previous time frame of a reconstructed quantised audio signal $\tilde{x}$, the
apparatus comprising:
an input for receiving the coded audio signal;
generating means (24,25,9) for generating said reconstructed quantised
audio signal $\tilde{x}$ from the quantised error signals $\tilde{E}(k)$; and
signal processing means (26) for generating a predicted audio signal $\hat{x}$
from the prediction coefficients A and said reconstructed audio signal $\tilde{x}$,
wherein said generating means comprises first transforming means (25) for
transforming the predicted audio signal $\hat{x}$ from the time domain to the frequency
domain to generate a set of predicted frequency sub-band signals $\hat{X}(k)$,
combining means (24) for combining said set of predicted frequency sub-band
signals $\hat{X}(k)$ with the quantised error signals $\tilde{E}(k)$ to generate a set of
reconstructed frequency sub-band signals $\tilde{X}(k)$, and second transforming means
(9) for performing a frequency to time domain transform on the reconstructed
frequency sub-band signals $\tilde{X}(k)$ to generate the reconstructed quantised audio
signal $\tilde{x}$.

Figure 1

Figure 2